503 research outputs found

    Dynamic feature selection for clustering high dimensional data streams

    Open access article. Change in a data stream can occur at the concept level and at the feature level. Change at the feature level can occur if new, additional features appear in the stream or if the importance and relevance of a feature changes as the stream progresses. This type of change has not received as much attention as concept-level change. Furthermore, many of the methods proposed for clustering streams (density-based, graph-based, and grid-based) rely on some form of distance as a similarity metric, which is problematic in high-dimensional data where the curse of dimensionality renders distance measurements and any concept of “density” difficult. To address these two challenges we propose combining them and framing the problem as a feature selection problem, specifically a dynamic feature selection problem. We propose a dynamic feature mask for clustering high-dimensional data streams. Redundant features are masked and clustering is performed along unmasked, relevant features. If a feature's perceived importance changes, the mask is updated accordingly: previously unimportant features are unmasked and features which lose relevance become masked. The proposed method is algorithm-independent and can be used with any of the existing density-based clustering algorithms, which typically do not have a mechanism for dealing with feature drift and struggle with high-dimensional data. We evaluate the proposed method on four density-based clustering algorithms across four high-dimensional streams: two text streams and two image streams. In each case, the proposed dynamic feature mask improves clustering performance and reduces the processing time required by the underlying algorithm. Furthermore, change at the feature level can be observed and tracked.
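The masking idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how feature importance is measured, so the exponentially weighted variance used here, and the names `DynamicFeatureMask`, `keep`, and `decay`, are assumptions made purely for illustration and are not the authors' method.

```python
class DynamicFeatureMask:
    """Keeps a per-feature importance score over a stream and masks low scorers.

    Assumption: importance is an exponentially weighted variance. The abstract
    does not specify the importance measure, so this is only a stand-in.
    """

    def __init__(self, keep, decay=0.2):
        self.keep = keep      # hypothetical: number of features left unmasked
        self.decay = decay    # hypothetical: weight given to each new point
        self.mean = None      # running mean per feature
        self.var = None       # running (exponentially weighted) variance
        self.mask = None      # True = unmasked, i.e. used for clustering

    def update(self, x):
        """Fold one point into the running statistics and refresh the mask."""
        if self.mean is None:
            # Initialise from the first point so constant features score zero.
            self.mean = list(x)
            self.var = [0.0] * len(x)
        else:
            for i, value in enumerate(x):
                delta = value - self.mean[i]
                self.mean[i] += self.decay * delta
                self.var[i] = (1 - self.decay) * (self.var[i] + self.decay * delta * delta)
        ranked = sorted(range(len(x)), key=lambda i: self.var[i], reverse=True)
        top = set(ranked[: self.keep])
        self.mask = [i in top for i in range(len(x))]

    def apply(self, x):
        """Return only the unmasked features, ready for any clustering algorithm."""
        return [v for v, m in zip(x, self.mask) if m]
```

Because the mask only filters the features passed on, any downstream clustering algorithm can consume the output of `apply` unchanged, which mirrors the algorithm-independence the abstract describes.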

    A test problem for visual investigation of high-dimensional multi-objective search

    An inherent problem in multiobjective optimization is that the visual observation of solution vectors with four or more objectives is infeasible, which brings major difficulties for algorithmic design, examination, and development. This paper presents a test problem, called the Rectangle problem, to aid the visual investigation of high-dimensional multiobjective search. Key features of the Rectangle problem are that the Pareto optimal solutions 1) lie in a rectangle in the two-variable decision space and 2) are similar (in the sense of Euclidean geometry) to their images in the four-dimensional objective space. In this case, it is easy to examine the behavior of objective vectors in terms of both convergence and diversity, by observing their proximity to the optimal rectangle and their distribution in the rectangle, respectively, in the decision space. Fifteen algorithms are investigated. Underperformance of Pareto-based algorithms as well as most state-of-the-art many-objective algorithms indicates that the proposed problem not only is a good tool to help visually understand the behavior of multiobjective search in a high-dimensional objective space but also can be used as a challenging benchmark function to test algorithms' ability in balancing the convergence and diversity of solutions.

    Interactive and non-interactive hybrid immigrants schemes for ant algorithms in dynamic environments

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. Dynamic optimization problems (DOPs) have been a major challenge for ant colony optimization (ACO) algorithms. The integration of ACO algorithms with immigrants schemes showed promising results on different DOPs. Each type of immigrants scheme aims to address a DOP with specific characteristics. For example, random and elitism-based immigrants perform well on severely and slightly changing environments, respectively. In this paper, two hybrid immigrants schemes, non-interactive and interactive, are proposed to combine the merits of the aforementioned immigrants schemes. The experiments on a series of dynamic travelling salesman problems showed that the hybridization of immigrants further improves the performance of ACO algorithms.

    Dynamic railway junction rescheduling using population based ant colony optimisation

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. Efficient rescheduling after a perturbation is an important concern of the railway industry. Extreme delays can result in large fines for the train company as well as dissatisfied customers. The problem is exacerbated by the fact that it is dynamic: more timetabled trains may arrive while the perturbed trains are waiting to be rescheduled. The new trains may have different priorities from the existing trains, so the rescheduling problem changes over time. The aim of this research is to apply a population-based ant colony optimisation algorithm to address this dynamic railway junction rescheduling problem using a simulator modelled on a real-world junction in the UK railway network. The results are promising: the algorithm performs well, particularly when the dynamic changes are of a high magnitude and frequency.

    Genetic algorithms with memory- and elitism-based immigrants in dynamic environments

    Copyright © 2008 by the Massachusetts Institute of Technology. In recent years the genetic algorithm community has shown a growing interest in studying dynamic optimization problems. Several approaches have been devised. The random immigrants and memory schemes are two major ones. The random immigrants scheme addresses dynamic environments by maintaining the population diversity while the memory scheme aims to adapt genetic algorithms quickly to new environments by reusing historical information. This paper investigates a hybrid memory and random immigrants scheme, called memory-based immigrants, and a hybrid elitism and random immigrants scheme, called elitism-based immigrants, for genetic algorithms in dynamic environments. In these schemes, the best individual from memory or the elite from the previous generation is retrieved as the base to create immigrants into the population by mutation. In this way, diversity is not only maintained but maintained more efficiently, so that genetic algorithms adapt to the current environment. Based on a series of systematically constructed dynamic problems, experiments are carried out to compare genetic algorithms with the memory-based and elitism-based immigrants schemes against genetic algorithms with traditional memory and random immigrants schemes and a hybrid memory and multi-population scheme. A sensitivity analysis of some key parameters is also carried out. Experimental results show that the memory-based and elitism-based immigrants schemes efficiently improve the performance of genetic algorithms in dynamic environments. This work was supported by the Engineering and Physical Sciences Research Council (EPSRC) of the United Kingdom under Grant EP/E060722/01.
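The elitism-based immigrants step described in the abstract above, where the elite of the previous generation is mutated to create immigrants, can be sketched as follows. The bit-string encoding, the function names, the `ratio` and `rate` parameters, and the choice to replace the worst individuals are assumptions for illustration; the abstract itself only states that immigrants are created from the elite by mutation.

```python
import random


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def elitism_based_immigrants(population, fitness, elite, ratio=0.2, rate=0.01, rng=random):
    """Return a new population in which mutated copies of `elite` act as immigrants.

    Assumptions (not stated in the abstract): individuals are bit lists, and the
    immigrants replace the worst `ratio` fraction of the current population.
    """
    n_immigrants = int(ratio * len(population))
    # Keep the best individuals; the worst ones make room for immigrants.
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[: len(population) - n_immigrants]
    # Immigrants stay close to the previous elite instead of being fully random.
    immigrants = [mutate(elite, rate, rng) for _ in range(n_immigrants)]
    return survivors + immigrants
```

A memory-based variant would, under the same assumptions, pass the best individual retrieved from memory as `elite`, while a purely random immigrants scheme would generate the immigrants from scratch.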

    Finding and tracking multi-density clusters in an online dynamic data stream

    The file attached to this record is the author's final peer reviewed version. Change is one of the biggest challenges in dynamic stream mining. From a data-mining perspective, adapting to and tracking change is desirable in order to understand how and why change has occurred. Clustering, a form of unsupervised learning, can be used to identify the underlying patterns in a stream. Density-based clustering identifies clusters as areas of high density separated by areas of low density. This paper proposes a Multi-Density Stream Clustering (MDSC) algorithm to address two problems: the multi-density problem and the problem of discovering and tracking changes in a dynamic stream. MDSC consists of two on-line components: discovered, labelled clusters and an outlier buffer. Incoming points are assigned to a live cluster or passed to the outlier buffer. New clusters are discovered in the buffer using an ant-inspired swarm intelligence approach. The newly discovered cluster is uniquely labelled and added to the set of live clusters. Processed data is subject to an ageing function and will disappear when it is no longer relevant. MDSC is shown to compare favourably with state-of-the-art peer stream-clustering algorithms on a range of real and synthetic data streams. Experimental results suggest that MDSC can discover qualitatively useful patterns while being scalable and robust to noise.
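The two on-line components named in the abstract above, a set of live clusters and an outlier buffer, together with ageing of processed data, can be sketched roughly as below. This is not the MDSC algorithm: the fixed `radius` assignment rule, the hard `max_age` cut-off used in place of the paper's ageing function, and the class name `StreamState` are all assumptions, and the ant-inspired discovery of new clusters in the buffer is left out entirely.

```python
import math


class StreamState:
    """Live clusters plus an outlier buffer, with a simple age-based expiry."""

    def __init__(self, radius, max_age):
        self.radius = radius    # hypothetical: maximum distance for assignment
        self.max_age = max_age  # hypothetical: lifetime of a point in time steps
        self.clusters = {}      # label -> list of (point, arrival_time)
        self.buffer = []        # outlier buffer: list of (point, arrival_time)

    def insert(self, point, now):
        """Assign the point to the nearest live cluster within radius, else buffer it."""
        best_label, best_dist = None, self.radius
        for label, members in self.clusters.items():
            for member, _ in members:
                d = math.dist(point, member)
                if d <= best_dist:
                    best_label, best_dist = label, d
        if best_label is None:
            self.buffer.append((point, now))
        else:
            self.clusters[best_label].append((point, now))
        return best_label

    def age(self, now):
        """Drop data older than max_age; clusters left with no members disappear."""
        def fresh(items):
            return [(p, t) for p, t in items if now - t <= self.max_age]

        self.buffer = fresh(self.buffer)
        self.clusters = {k: fresh(v) for k, v in self.clusters.items() if fresh(v)}
```

In a fuller version, a cluster-discovery step would periodically scan `buffer`, label any newly found cluster, and move its points into `clusters`.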

    A benchmark generator for dynamic multi-objective optimization problems

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. Many real-world optimization problems not only have multiple objectives that conflict with each other but also change over time. They are dynamic multi-objective optimization problems (DMOPs) and the corresponding field is called dynamic multi-objective optimization (DMO), which has gained growing attention in recent years. However, one main issue in the field of DMO is that there is no standard test suite to determine whether an algorithm is capable of solving them. This paper presents a new benchmark generator for DMOPs that can generate several complicated characteristics, including mixed Pareto-optimal front (convexity-concavity), strong dependencies between variables, and a mixed type of change, which are rarely tested in the literature. Experiments are conducted to compare the performance of five state-of-the-art DMO algorithms on several typical test functions derived from the proposed generator, which gives a better understanding of the strengths and weaknesses of these tested algorithms for DMOPs.

    Ant colony optimization for scheduling walking beam reheating furnaces

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. This paper presents a new mathematical model for the walking beam reheating furnace scheduling problem (WBRFSP) in an iron and steel plant, which allows the mixed package of hot and cold slabs and aims to minimize energy consumption and improve product quality. An ant colony optimization (ACO) algorithm is designed to solve this model. Simulation results based on field data from an iron and steel plant show the effectiveness of the proposed model and algorithm.

    Evolutionary algorithms in dynamic environments

    The file attached to this record is the author's final peer reviewed version. Evolutionary algorithms (EAs) are widely used for solving stationary optimization problems where the fitness landscape or objective function does not change during the course of computation. However, the environments of real-world optimization problems may fluctuate or change sharply. If the optimization problem is dynamic, the goal is no longer to find the extrema, but to track their progression through the search space as closely as possible. Approaches that have been proposed to make EAs suitable for dynamic environments are surveyed, including increasing diversity, maintaining diversity, memory-based approaches, and multi-population approaches.